58 research outputs found
The Topological Non-connectivity Threshold and magnetic phase transitions in classical anisotropic long-range interacting spin system
We analyze from the dynamical point of view the classical characteristics of
the Topological Non-connectivity Threshold (TNT), recently introduced in
F. Borgonovi, G. L. Celardo, M. Maianti, E. Pedersoli, J. Stat. Phys. 116, 516 (2004).
This shows interesting connections among Topology, Dynamics, and
Thermo-Statistics of ferro/paramagnetic phase transition in classical spin
systems, due to the combined effect of anisotropy and long-range interactions.
Comment: 6 revtex pages, 4 .eps figures. Contribution presented at the 3rd
Conference NEXT-Sigma-Phi News, Expectations, and Trends in Statistical
Physics, August 13-18 2005, Kolymbari, Crete. For related results see also
cond-mat/0402270 cond-mat/0410119 cond-mat/0505209 cond-mat/0506233
cond-mat/051007
Mastering the Spatio-Temporal Knowledge Discovery Process
The thesis addresses a topic of great importance: a framework for mining positioning data collected by personal mobile devices.
The main contribution of this thesis is a theoretical and practical framework for managing the complex knowledge discovery process on mobility data. Building such a framework requires integrating very different aspects of the process, each with its own assumptions and requirements. The result is a homogeneous system that exploits the power of all its components with the flexibility of a database, including a new way of using an ontology for automatic reasoning on trajectory data. Furthermore, two extensions are developed and integrated into the system to confirm its extensibility: an innovative way to reconstruct trajectories that takes into account the uncertainty of the path followed, and a location prediction algorithm called WhereNext.
Another important contribution of the thesis is the experimentation on a real case study of mobility data analysis, which shows the usefulness of the system for a mobility manager who is provided with a knowledge discovery framework
ConQueSt: a Constraint-based Querying System for Exploratory Pattern Discovery
The contribution of this thesis is the design and development of a Knowledge Discovery system called ConQueSt.
Based on the constraint-based Pattern Discovery paradigm, ConQueSt follows the Inductive Database vision:
• mining is seen as a more complex form of querying,
• the system is therefore equipped with a data mining query language and tightly coupled with a DBMS,
• the patterns extracted by mining queries become first-class citizens and, following the closure principle, are materialized alongside the data in the DBMS.
ConQueSt has already been presented successfully at the international workshop of the IDB community and at the prestigious IEEE International Conference on Data Engineering (ICDE 2006). In June it will be presented at the Italian conference on database systems (SEBD 2006). Submission to a prestigious journal is currently under way
A workflow language for research e-infrastructures
Research e-infrastructures are "systems of systems," patchworks of resources such as tools and services, which change over time to address the evolving needs of the scientific process. In such environments, researchers carry out their scientific process as sequences of actions that mainly include invocation of web services, user interaction with web applications, and user download and use of shared software libraries/tools. The resulting workflows are intended to generate new research products (articles, datasets, methods, etc.) out of existing ones. Sharing a digital and executable representation of such workflows with other scientists would enforce the Open Science publishing principles of "reproducibility of science" and "transparent assessment of science." This work presents HyWare, a language and execution platform capable of representing scientific processes in highly heterogeneous research e-infrastructures in terms of so-called hybrid workflows. Hybrid workflows can express sequences of "manually executable actions," i.e., formal descriptions guiding users to repeat a reasoning, protocol or manual procedure, and "machine-executable actions," i.e., encodings of the automated execution of one (or more) web services. A HyWare execution platform enables scientists to (i) create and share workflows out of a given action set (as defined by the users to match e-infrastructure needs) and (ii) execute hybrid workflows, making sure that the inputs and outputs of the actions flow properly across manual and automated actions. The HyWare language and platform can be implemented as an extension of well-known workflow languages and platforms
PRUDEnce: A system for assessing privacy risk vs utility in data sharing ecosystems
Data describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database. Therefore, Data Providers, before sharing those data, must apply some form of anonymization to lower the privacy risks, but they must also be aware of, and capable of controlling, the data quality, since these two factors often trade off against each other. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated with users represented in the data, and the data quality guaranteed only with users not at risk. Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data
How you move reveals who you are: understanding human behavior by analyzing trajectory data
The widespread use of mobile devices is producing a huge amount of trajectory data, making the discovery of movement patterns possible, which are crucial for understanding human behavior. Significant advances have been made with regard to knowledge discovery, but the process now needs to be extended bearing in mind the emerging field of behavior informatics. This paper describes the formalization of a semantic-enriched KDD process for supporting meaningful pattern interpretations of human behavior. Our approach is based on the integration of inductive reasoning (movement pattern discovery) and deductive reasoning (human behavior inference). We describe the implemented Athena system, which supports such a process, along with the experimental results on two different application domains related to traffic and recreation management
A global descriptor of spatial pattern interaction in the galaxy distribution
We present the function J as a morphological descriptor for point patterns
formed by the distribution of galaxies in the Universe. This function was
recently introduced in the field of spatial statistics, and is based on the
nearest neighbor distribution and the void probability function. The J
descriptor allows one to distinguish clustered (i.e. correlated) from ``regular''
(i.e. anti-correlated) point distributions. We outline the theoretical
foundations of the method, perform tests with a Matern cluster process as an
idealised model of galaxy clustering, and apply the descriptor to galaxies and
loose groups in the Perseus-Pisces Survey. A comparison with mock-samples
extracted from a mixed dark matter simulation shows that the J descriptor can
be profitably used to constrain (in this case reject) viable models of cosmic
structure formation.
Comment: Significantly enhanced version, 14 pages, LaTeX using epsf, aaspp4, 7
eps-figures, accepted for publication in the Astrophysical Journa
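As a rough illustration of the descriptor described above, the J function of spatial statistics is usually written as J(r) = (1 - G(r)) / (1 - F(r)), where G is the nearest-neighbor distance distribution and F is the empty-space function (one minus the void probability). The sketch below is a minimal plain-Python estimate under simplifying assumptions that are not taken from the abstract: a 2D pattern, no edge correction, and a fixed set of probe locations for F. The function names are hypothetical.

```python
import math


def nearest_distance(location, points, skip_index=None):
    """Distance from `location` to the closest point, optionally skipping one index."""
    best = math.inf
    for i, q in enumerate(points):
        if i == skip_index:
            continue
        best = min(best, math.dist(location, q))
    return best


def j_function(points, probes, r):
    """Naive estimate of J(r) = (1 - G(r)) / (1 - F(r)), without edge correction.

    G(r): fraction of points whose nearest neighbor lies within r.
    F(r): fraction of probe locations with at least one point within r.
    """
    g = sum(nearest_distance(p, points, i) <= r for i, p in enumerate(points)) / len(points)
    f = sum(nearest_distance(x, points) <= r for x in probes) / len(probes)
    if f >= 1.0:
        return math.nan  # J is undefined once every probe has a point within r
    return (1.0 - g) / (1.0 - f)


# A regular 5x5 lattice with unit spacing, probed on a finer grid.
regular = [(float(i), float(j)) for i in range(5) for j in range(5)]
probes = [(0.25 * a, 0.25 * b) for a in range(17) for b in range(17)]
print(j_function(regular, probes, r=0.5))  # exceeds 1 for this regular lattice
```

Values above 1 indicate a regular (anti-correlated) pattern and values below 1 a clustered one, while a Poisson process gives J = 1. A real analysis of survey data would also need edge corrections and a proper treatment of the survey geometry.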
A risk model for privacy in trajectory data
Time sequence data relating to users, such as medical histories and mobility data, are good candidates for data mining, but often contain highly sensitive information. Different methods in privacy-preserving data publishing are utilised to release such private data so that individual records in the released data cannot be re-linked to specific users with a high degree of certainty. These methods provide theoretical worst-case privacy risks as measures of the privacy protection that they offer. However, for many real-world datasets the worst-case scenario is too pessimistic and does not provide a realistic view of the privacy risks: the real probability of re-identification is often much lower than the theoretical worst-case risk. In this paper, we propose a novel empirical risk model for privacy which, in relation to the cost of privacy attacks, better demonstrates the practical risks associated with a privacy-preserving data release. We present a detailed evaluation of the proposed risk model using k-anonymised real-world mobility data, and then show how the empirical evaluation of the privacy risk follows a different trend in synthetic data describing random movements
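To make the gap between worst-case and empirical risk concrete, the sketch below computes one simple empirical measure: the probability of re-identifying a user as one over the number of released trajectories consistent with what an adversary knows. This is an illustrative assumption, not the risk model proposed in the paper; the function name, the set-based matching rule, and the toy data are all hypothetical.

```python
def reidentification_risk(trajectories, target_id, known_locations):
    """Empirical re-identification probability for one user.

    trajectories: dict mapping user id -> iterable of visited locations.
    known_locations: locations the adversary knows the target has visited.
    The risk is 1 / (number of users whose trajectory contains all known locations).
    """
    known = set(known_locations)
    matches = [uid for uid, visited in trajectories.items() if known <= set(visited)]
    if target_id not in matches:
        raise ValueError("background knowledge is inconsistent with the target's trajectory")
    return 1.0 / len(matches)


# Toy data set (hypothetical): three users and the locations they visited.
data = {
    "u1": ["A", "B", "C"],
    "u2": ["A", "B", "D"],
    "u3": ["A", "E"],
}
for knowledge in (["A"], ["A", "B"], ["A", "C"]):
    print(knowledge, reidentification_risk(data, "u1", knowledge))
```

The more locations the adversary must learn, the higher the risk but also the higher the cost of the attack, which is the kind of trade-off the abstract refers to.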
Clustering of loose groups and galaxies from the Perseus--Pisces Survey
We investigate the clustering properties of loose groups in the
Perseus--Pisces redshift Survey (PPS). Previous analyses based on CfA and SSRS
surveys led to apparently contradictory results. We investigate the source of
such discrepancies, finding satisfactory explanations for them. Furthermore, we
find a definite signal of group clustering, whose amplitude exceeds the
amplitude of galaxy clustering (,
for the most significant case; distances are
measured in \hMpc). Groups are identified with the adaptive
Friends--Of--Friends (FOF) algorithms HG (Huchra \& Geller 1982) and NW
(Nolthenius \& White 1987), systematically varying all search parameters.
Correlation strength is especially sensitive to the sky--link (increasing
for stricter normalization ), and to the (depth \mlim of the) galaxy
data. It is only moderately dependent on the galaxy luminosity function
, while it is almost insensitive to the redshift--link (both to
the normalization and to the scaling recipes HG or NW).
Comment: 28 pages (LaTeX aasms4 style) + 5 Postscript figures; ApJ submitted
on May 4th, 1996; group catalogs available upon request
([email protected]
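The Friends--Of--Friends idea mentioned above can be sketched as follows: two objects are "friends" if they lie within a linking length of each other, and a group is a set of objects connected through chains of friends. The sketch is a deliberate simplification and is not the HG or NW algorithm: it uses a single fixed Euclidean linking length, whereas the adaptive versions use separate sky and redshift links that scale with depth. The function name and sample coordinates are hypothetical.

```python
import math


def friends_of_friends(points, linking_length):
    """Group points into connected components of the 'closer than linking_length' graph.

    Returns a sorted list of groups, each a list of point indices in ascending order.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of friends (O(n^2) pair search, fine for a small sketch).
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


sample = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
print(friends_of_friends(sample, linking_length=0.6))
```

Points 0 and 2 are farther apart than the linking length but end up in the same group because both are friends of point 1, which is why the resulting catalogs depend on the link parameters.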